
    Scalable ASL sign recognition using model-based machine learning and linguistically annotated corpora

    We report on the high success rates of our new, scalable, computational approach for sign recognition from monocular video, exploiting linguistically annotated ASL datasets with multiple signers. We recognize signs using a hybrid framework combining state-of-the-art learning methods with features based on what is known about the linguistic composition of lexical signs. We model and recognize the sub-components of sign production, with attention to hand shape, orientation, location, and motion trajectories, as well as non-manual features, and we combine these within a CRF framework. The effect is to make the sign recognition problem robust, scalable, and feasible with smaller datasets than are required for purely data-driven methods. From a 350-sign vocabulary of isolated, citation-form lexical signs from the American Sign Language Lexicon Video Dataset (ASLLVD), including both 1- and 2-handed signs, we achieve a top-1 accuracy of 93.3% and a top-5 accuracy of 97.9%. The high probability that the correct sign appears among the top 5 candidates opens the door to practical applications: a sign lookup tool can present the user with 5 candidate signs, in decreasing order of likelihood, and ask the user to select the desired one.
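
    The 5-candidate lookup described above reduces to a top-k selection over per-sign scores. Below is a minimal numpy sketch of that final step only; the scores and sign labels are random placeholders, not output from the authors' recognizer.

        import numpy as np

        def top_k_candidates(scores, sign_labels, k=5):
            """Return the k highest-scoring signs in decreasing order of likelihood."""
            order = np.argsort(scores)[::-1][:k]  # indices of the k best scores
            return [(sign_labels[i], float(scores[i])) for i in order]

        # Hypothetical usage: one score per sign in a 350-sign vocabulary.
        scores = np.random.rand(350)
        labels = ["SIGN_%d" % i for i in range(350)]
        for sign, score in top_k_candidates(scores, labels, k=5):
            print(sign, round(score, 3))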

    Linguistically-driven framework for computationally efficient and scalable sign recognition

    We introduce a new general framework for sign recognition from monocular video using limited quantities of annotated data. The novelty of the hybrid framework we describe here is that we exploit state-of-the-art learning methods while also incorporating features based on what we know about the linguistic composition of lexical signs. In particular, we analyze hand shape, orientation, location, and motion trajectories, and then use CRFs to combine this linguistically significant information for purposes of sign recognition. Our robust modeling and recognition of these sub-components of sign production allow an efficient parameterization of the sign recognition problem as compared with purely data-driven methods. This parameterization enables a scalable and extendable time-series learning approach that advances the state of the art in sign recognition, as shown by the results reported here for recognition of isolated, citation-form, lexical signs from American Sign Language (ASL).
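
    The paper's code is not reproduced here; the sketch below only illustrates the general pattern of combining per-frame linguistic features in a linear-chain CRF, using the third-party sklearn-crfsuite package. All feature names, values, and labels are invented for illustration.

        import sklearn_crfsuite  # pip install sklearn-crfsuite

        def frame_features(frame):
            # Placeholder features; a real system would derive these from
            # tracking: handshape class, orientation, location, motion.
            return {
                "handshape": frame["handshape"],
                "orientation": frame["orientation"],
                "location": frame["location"],
                "motion": frame["motion"],
            }

        # One toy sequence of three frames for a hypothetical sign.
        frames = [
            {"handshape": "B", "orientation": "palm-in", "location": "chest", "motion": "up"},
            {"handshape": "B", "orientation": "palm-in", "location": "chin", "motion": "up"},
            {"handshape": "B", "orientation": "palm-in", "location": "chin", "motion": "hold"},
        ]
        X = [[frame_features(f) for f in frames]]
        y = [["SIGN_A"] * len(frames)]  # per-frame labels for the toy sign

        crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=50)
        crf.fit(X, y)
        print(crf.predict(X))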

    A new framework for sign language recognition based on 3D handshape identification and linguistic modeling

    Current approaches to sign recognition by computer generally have at least some of the following limitations: they rely on laboratory conditions for sign production, are limited to a small vocabulary, rely on 2D modeling (and therefore cannot deal with occlusions and off-plane rotations), and/or achieve limited success. Here we propose a new framework that (1) provides a new tracking method less dependent than others on laboratory conditions and able to deal with variations in background and skin regions (such as the face, forearms, or the other hand); (2) allows for identification of 3D hand configurations that are linguistically important in American Sign Language (ASL); and (3) incorporates statistical information reflecting linguistic constraints in sign production. For purposes of large-scale computer-based sign language recognition from video, the ability to distinguish hand configurations accurately is critical. Our current method estimates the 3D hand configuration to distinguish among 77 hand configurations linguistically relevant for ASL. Constraining the problem in this way makes recognition of 3D hand configuration more tractable and provides the information specifically needed for sign recognition. Further improvements are obtained by incorporation of statistical information about linguistic dependencies among handshapes within a sign, derived from an annotated corpus of almost 10,000 sign tokens.
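
    One way to read the combination of classifier scores with corpus statistics is as joint rescoring of start/end handshape pairs; here is a minimal numpy sketch under that reading. All probabilities below are random placeholders standing in for the handshape estimator's outputs and the corpus-derived dependency model.

        import numpy as np

        N = 77  # linguistically relevant ASL hand configurations (per the paper)

        # Placeholder inputs: per-class log-likelihoods from a 3D handshape
        # estimator at the start and end of a sign, plus corpus-derived
        # log P(end handshape | start handshape). All values here are random.
        loglik_start = np.log(np.random.dirichlet(np.ones(N)))
        loglik_end = np.log(np.random.dirichlet(np.ones(N)))
        log_trans = np.log(np.random.dirichlet(np.ones(N), size=N))  # shape (N, N)

        # Joint score for every (start, end) pair: observation terms plus the
        # linguistic dependency term estimated from annotated sign tokens.
        joint = loglik_start[:, None] + loglik_end[None, :] + log_trans
        start, end = np.unravel_index(np.argmax(joint), joint.shape)
        print("best start/end handshape pair:", start, end)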

    The Importance of 3D Motion Trajectories for Computer-based Sign Recognition

    Computer-based sign language recognition from video is a challenging problem because of the spatiotemporal complexities inherent in sign production and the variations within and across signers. However, linguistic information can help constrain sign recognition to make it a more feasible classification problem. We have previously explored recognition of linguistically significant 3D hand configurations, as start and end handshapes represent one major component of signs; others include hand orientation, place of articulation in space, and movement. Thus, although recognition of handshapes (on one or both hands) at the start and end of a sign is essential for sign identification, it is not sufficient. Analysis of hand and arm movement trajectories can provide additional information critical for sign identification. In order to test the discriminative potential of the hand motion analysis, we performed sign recognition based exclusively on hand trajectories while holding the handshape constant. To facilitate this evaluation, we captured a collection of videos involving signs with a constant handshape produced by multiple subjects, and we automatically annotated the 3D motion trajectories. 3D hand locations are normalized in accordance with invariant properties of ASL movements. We trained time-series learning-based models for different signs of constant handshape in our dataset using the normalized 3D motion trajectories. Results show significant computer-based sign recognition accuracy across subjects and across a diverse set of signs. Our framework demonstrates the discriminative power and importance of 3D hand motion trajectories for sign recognition, given known handshapes.
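
    The paper's specific normalization and time-series models are not given in the abstract; the sketch below substitutes a simple body-centered normalization and a DTW nearest-neighbor classifier to make the trajectory-only recognition pipeline concrete. The normalization scheme, the classifier choice, and all data are illustrative assumptions.

        import numpy as np

        def normalize(traj, shoulder_center, shoulder_width):
            """Translate a (T, 3) wrist trajectory to a body-centered origin
            and scale by shoulder width, for position- and size-invariance."""
            return (traj - shoulder_center) / shoulder_width

        def dtw(a, b):
            """Dynamic-time-warping distance between two (T, 3) trajectories."""
            n, m = len(a), len(b)
            D = np.full((n + 1, m + 1), np.inf)
            D[0, 0] = 0.0
            for i in range(1, n + 1):
                for j in range(1, m + 1):
                    cost = np.linalg.norm(a[i - 1] - b[j - 1])
                    D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
            return D[n, m]

        def classify(query, templates):
            """1-nearest-neighbor sign classification over labeled trajectories."""
            return min(templates, key=lambda t: dtw(query, t[1]))[0]

        # Toy usage: random points stand in for tracked 3D wrist positions.
        center, width = np.array([0.0, 1.4, 0.0]), 0.35
        templates = [
            ("SIGN_A", normalize(np.random.rand(20, 3), center, width)),
            ("SIGN_B", normalize(np.random.rand(25, 3), center, width)),
        ]
        print(classify(normalize(np.random.rand(22, 3), center, width), templates))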

    Pre-Clinical Myocardial Metabolic Alterations in Chronic Kidney Disease

    The risk for cardiovascular events conferred by decreased renal function is curvilinear, with exponentially greater increases in risk as estimated glomerular filtration rate (eGFR) declines. In 13 non-diabetic pre-dialysis chronic kidney disease (CKD) patients, we employed quantitative F-18 fluorodeoxyglucose (FDG) positron emission tomography (PET) to measure myocardial metabolic changes.
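
    The abstract does not state which eGFR equation the study used; for concreteness only, here is the widely used four-variable MDRD study equation as a short Python sketch.

        def egfr_mdrd(serum_creatinine_mg_dl, age_years, female, black):
            """Four-variable MDRD estimate of GFR in mL/min/1.73 m^2.
            Shown for illustration; the study abstract does not specify
            which eGFR equation was used."""
            egfr = 175.0 * serum_creatinine_mg_dl ** -1.154 * age_years ** -0.203
            if female:
                egfr *= 0.742
            if black:
                egfr *= 1.212
            return egfr

        print(round(egfr_mdrd(1.4, 60, female=False, black=False), 1))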

    Higher Order Apriori

    Frequent itemset mining (FIM) is a well known technique for discovering relationships between items. Most FIM algorithms are based on first-order associations between items in the same record. Although a few algorithms capable of discovering indirect propositional rules exist, they do not extend beyond second-order. In addition, although multi-relational association rule mining (ARM) discovers higher-order rules, the rules are non-propositional and the algorithm is NP-complete. This article introduces Higher Order Apriori, a novel algorithm for mining higher-order rules. We extend the itemset definition to incorporate k-itemsets up to nth-order, and present our levelwise order-first algorithm: levelwise meaning that the size of k-itemsets increases in each iteration (as with Apriori), and order-first meaning that at each level, itemsets are generated across all orders. Support is calculated based on the order of itemsets and the number of higher-order associations connecting items.
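
    The algorithm itself is not reproduced in the abstract; the sketch below only illustrates what "order" means here, contrasting first-order (same-record) item pairs with second-order pairs linked through a shared item across records. The function names and the linkage rule are illustrative assumptions, not the paper's definitions.

        from itertools import combinations

        def first_order_pairs(records):
            """Pairs of items co-occurring in the same record (classic Apriori)."""
            pairs = set()
            for rec in records:
                pairs.update(combinations(sorted(rec), 2))
            return pairs

        def second_order_pairs(records):
            """Pairs of items from different records that share a linking item."""
            pairs = set()
            for r1, r2 in combinations(records, 2):
                if r1 & r2:  # a shared item connects the two records
                    for a in r1 - r2:
                        for b in r2 - r1:
                            pairs.add(tuple(sorted((a, b))))
            return pairs

        records = [{"a", "b"}, {"b", "c"}, {"c", "d"}]
        print(first_order_pairs(records))   # ('a','b'), ('b','c'), ('c','d')
        print(second_order_pairs(records))  # includes ('a','c') via shared 'b'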

    Distributed Higher Order Text Mining

    The burgeoning amount of textual data in distributed sources, combined with the obstacles involved in creating and maintaining central repositories, motivates the need for effective distributed information extraction and mining techniques. Recently, as the need to mine patterns across distributed databases has grown, Distributed Association Rule Mining (D-ARM) algorithms have been developed. These algorithms, however, assume that the databases are either horizontally or vertically distributed. In the special case of databases populated from information extracted from textual data, existing D-ARM algorithms cannot discover rules based on higher-order associations between items in distributed textual documents that are neither vertically nor horizontally distributed, but rather a hybrid of the two. In this article we present D-HOTM, a framework and system for Distributed Higher Order Text Mining. Unlike existing algorithms, those encapsulated in D-HOTM require neither full knowledge of the global schema nor that the distribution of data be horizontal or vertical. D-HOTM discovers rules based on higher-order associations between distributed database records containing the extracted entities. A theoretical framework for reasoning about record linkage is provided to support the discovery of higher-order associations. In order to handle record linkage, the traditional evaluation metrics employed in ARM are extended. The implementation of D-HOTM is based on the TMI and tested on a cluster at the National Center for Supercomputing Applications (NCSA). A sample manual run, results of experimental runs on the NCSA clusters, and theoretical comparisons demonstrate the performance and relevance of D-HOTM in e-marketplaces, law enforcement, and homeland defense.
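
    D-HOTM's actual machinery (schema handling, extended metrics, the TMI-based implementation) is not detailed in the abstract; the sketch below only illustrates the underlying idea of linking distributed records through extracted entities so that higher-order associations can span databases. All names and data are invented for illustration.

        from collections import defaultdict

        def link_by_entities(*databases):
            """Group (site, record-id) pairs from all databases under each
            extracted entity, approximating record linkage by shared mentions."""
            index = defaultdict(set)
            for db_name, records in databases:
                for rec_id, entities in records.items():
                    for ent in entities:
                        index[ent].add((db_name, rec_id))
            return index

        # Toy fragments that are neither purely horizontal nor vertical splits.
        site_a = ("site_a", {1: {"acme corp", "j. smith"}, 2: {"j. smith"}})
        site_b = ("site_b", {1: {"acme corp", "route 9"}})

        index = link_by_entities(site_a, site_b)
        # Records linked through "acme corp" span both sites, so a higher-order
        # association between "j. smith" and "route 9" becomes discoverable.
        print(index["acme corp"])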